-
Prompting LLMs for complex tasks (e.g., building a trip-advisor chatbot) requires humans to clearly articulate customized requirements (e.g., “start the response with a tl;dr”). However, existing prompt engineering instruction often lacks focused training on requirement articulation and instead tends to emphasize increasingly automatable strategies (e.g., tricks like adding role-plays and “think step-by-step”). To address this gap, we introduce Requirement-Oriented Prompt Engineering (ROPE), a paradigm that focuses human attention on generating clear, complete requirements during prompting. We implement ROPE through an assessment and training suite that provides deliberate practice with LLM-generated feedback. In a randomized controlled experiment with 30 novices, ROPE significantly outperformed conventional prompt engineering training (20% vs. 1% gains), a gap that automatic prompt optimization cannot close. Furthermore, we demonstrate a direct correlation between the quality of input requirements and LLM outputs. Our work paves the way to empower more end-users to build complex LLM applications.
Free, publicly-accessible full text available April 24, 2026.
-
Many college students drop STEM majors after struggling in gateway courses, in part because these courses place large demands on students' time. In three online experiments with two different lessons (measures of central tendency and multiple regression), we identified a promising approach to increase the efficiency of STEM instruction. When we removed lectures and taught participants exclusively with practice and feedback, they learned at least 15% faster. However, our research also showed that this instructional strategy has the potential to undermine interest in course content for less-confident students, who may be discouraged when challenged to solve problems and learn from their mistakes without upfront instruction. If researchers and educators can develop engaging and efficacy-building activities that replace lectures, STEM courses could become better learning environments.
Free, publicly-accessible full text available January 27, 2026.
-
Research spanning nearly a century has found that math plays an important role in the learning of chemistry. Here, we use a large dataset of student interactions with online courseware to investigate the details of this link between math and chemistry. The activities in the courseware are labeled against a list of knowledge components (KCs) covered by the content, and student interactions are tracked over a full semester of general chemistry at a range of institutions. Logistic regression is used to model student performance as a function of the number of opportunities a student has taken to engage with a particular KC. This regression analysis generates estimates of both the initial knowledge and the learning rate for each student and each KC. Consistent with results from other domains, the initial knowledge varies substantially across students, but the learning rate is nearly the same for all students. The role of math is investigated by labeling each KC with the level of math involved. The overwhelming result from regressions based on these labels is that only the initial knowledge varies strongly across students and across the level of math involved in a particular topic. The student learning rate is nearly independent of both the level of math involved in a KC and the prior mathematical preparation of an individual student. The observation that the primary challenge for students lies in initial knowledge, rather than learning rate, may have implications for course and curriculum design.
Free, publicly-accessible full text available November 12, 2025.
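The logistic learning-curve model described in this abstract can be sketched as follows. This is a minimal illustration of the model form (log-odds of a correct response = student initial knowledge + KC term + learning rate × practice opportunities); all parameter values here are made-up assumptions for demonstration, not fitted values from the study:

```python
import numpy as np

def p_correct(theta_student, beta_kc, gamma_kc, opportunities):
    """Logistic learning-curve model: the log-odds of a correct response
    combine the student's initial knowledge (theta), a KC-specific term
    (beta), and a learning rate (gamma) scaled by the number of practice
    opportunities taken so far."""
    logit = theta_student + beta_kc + gamma_kc * opportunities
    return 1.0 / (1.0 + np.exp(-logit))

# Illustrative (hypothetical) parameters: two students who differ in
# initial knowledge but share the same learning rate, mirroring the
# paper's finding that learning rates are nearly uniform across students.
opps = np.arange(0, 10)
strong = p_correct(theta_student=1.0, beta_kc=0.0, gamma_kc=0.15, opportunities=opps)
weak = p_correct(theta_student=-0.5, beta_kc=0.0, gamma_kc=0.15, opportunities=opps)
```

With a shared learning rate, both curves rise in parallel on the log-odds scale, so the gap set by initial knowledge persists across practice, which is the pattern the abstract highlights.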
-
Paaßen, Benjamin; Demmans Epp, Carrie (Eds.)
What does it mean to be a better model? One conceptualization, indeed a common one in Educational Data Mining, is that a better model is the one that fits the data better, that is, higher prediction accuracy. However, oftentimes, models that maximize prediction accuracy do not provide meaningful parameter estimates. Here we argue that models that provide meaningful parameters are better models and, indeed, often also provide higher prediction accuracy. To illustrate our argument, we investigate the Performance Factors Analysis (PFA) model and the Additive Factors Model (AFM). PFA often has higher prediction accuracy than the AFM; however, PFA's parameter estimates are ambiguous and confounded. We propose more interpretable models (AFMh and PFAh) designed to address the confounded parameters and demonstrate PFA's confounding issues with synthetic data. The results from the experiment with 27 real-world datasets also support our claims and show that the more interpretable models can produce better predictions.
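The structural difference between AFM and PFA discussed in this abstract can be sketched as follows. This is a simplified illustration of the standard model forms, not the authors' AFMh/PFAh variants; all symbols and values are assumptions for demonstration:

```python
import math

def afm_logit(theta, beta, gamma, opportunities):
    # AFM: a single learning-rate slope per KC, applied to the total
    # count of prior practice opportunities.
    return theta + beta + gamma * opportunities

def pfa_logit(theta, beta, gamma_s, gamma_f, successes, failures):
    # PFA: separate slopes for prior successes and prior failures.
    # Because successes + failures equals total opportunities, the two
    # slopes are entangled with the overall amount of practice, one
    # source of the confounded estimates the abstract describes.
    return theta + beta + gamma_s * successes + gamma_f * failures

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))
```

Note that when PFA's two slopes are equal, it reduces exactly to AFM with opportunities = successes + failures, which makes the extra parameters hard to interpret in isolation.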
-
Visual thinking with diagrams is a crucial skill for learning and problem-solving in STEM subjects. To improve in this area, students need a variety of visual problems for deliberate practice. However, in our interviews, educators shared that they struggle to create these practice exercises because of limitations of existing tools. We introduce Edgeworth, a tool designed to help educators easily create visual problems. Edgeworth works in two main ways: first, it takes a single diagram from the user and systematically alters it to produce many variations, from which the educator can choose to create multiple problems; second, it automates the layout of diagrams, ensuring consistent high quality without the need for manual adjustments. To assess Edgeworth, we carried out case studies, a technical evaluation, and expert walkthrough demonstrations. We show that Edgeworth can create problems in three domains: geometry, chemistry, and discrete math. These problems were authored in just 15 lines of Edgeworth code on average. Edgeworth generated usable answer options within the first 10 diagram variations in 87% of authored problems. Finally, educators gave positive feedback on Edgeworth's utility and the real-world applicability of its outputs.
-
Paaßen, Benjamin; Demmans Epp, Carrie (Eds.)
There is a growing community of researchers at the intersection of data mining, AI, and computing education research. The objective of the CSEDM workshop is to facilitate a discussion among this research community, with a focus on how data mining can be uniquely applied in computing education research. For example, what new techniques are needed to analyze program code and CS log data? How do results from CS education inform our analysis of this data? The workshop is meant to be an interdisciplinary event at the intersection of EDM and Computing Education Research. Researchers, faculty, and students are encouraged to share their AI- and data-driven approaches, methodologies, and experiences where data transforms how students learn Computer Science (CS) skills. This full-day hybrid workshop will feature paper presentations and discussions to promote collaboration.
-
Large Language Models (LLMs) now excel at generative skills and can create content at remarkable speed. However, they are imperfect and still make various mistakes. In a Computer Science education context, as these models are widely recognized as “AI pair programmers,” it becomes increasingly important to train students to evaluate and debug LLM-generated code. In this work, we introduce HypoCompass, a novel system to facilitate deliberate practice on debugging, where human novices play the role of Teaching Assistants and help LLM-powered teachable agents debug code. We enable effective task delegation between students and LLMs in this learning-by-teaching environment: students focus on hypothesizing the cause of code errors, while adjacent skills like code completion are offloaded to LLM agents. Our evaluations demonstrate that HypoCompass generates high-quality training materials (e.g., bugs and fixes), outperforming human counterparts fourfold in efficiency, and significantly improves student debugging performance by 12% from pre-test to post-test.
-
Intelligent science exhibits: Transforming hands-on exhibits into mixed-reality learning experiences
Museum exhibits encourage exploration with physical materials, typically with minimal signage or guidance. Ideally, children get interactive support as they explore, but it is not always feasible to have knowledgeable staff regularly present. Technology-based interactive support can provide guidance to help learners achieve scientific understanding of how and why things work, along with engineering skills for designing and constructing useful artifacts and for solving important problems. We have developed an innovative AI-based technology, Intelligent Science Exhibits, that provides interactive guidance to visitors of an inquiry-based science exhibit. We used this technology to investigate alternative views of appropriate levels of guidance in exhibits. We contrasted visitor engagement and learning from interaction with an Intelligent Science Exhibit against a matched conventional exhibit. We found evidence that the Intelligent Science Exhibit produces substantially better learning for both scientific and engineering outcomes, equivalent levels of self-reported enjoyment, and higher levels of engagement as measured by the length of time voluntarily spent at the exhibit. These findings show potential for transforming hands-on museum exhibits with intelligent science exhibits and, more generally, indicate how providing children with feedback on their predictions and scientific explanations enhances their learning and engagement.